Search results for "machine listening"

Showing 3 of 3 documents

An Open-set Recognition and Few-Shot Learning Dataset for Audio Event Classification in Domestic Environments

2020

The problem of training with a small set of positive samples is known as few-shot learning (FSL). It is widely known that traditional deep learning (DL) algorithms usually perform very well when trained with large datasets. However, in many applications, it is not possible to obtain such a large number of samples. In the image domain, typical FSL applications include those related to face recognition. In the audio domain, music fraud or speaker recognition can clearly benefit from FSL methods. This paper deals with the application of FSL to the detection of specific and intentional acoustic events given by different types of sound alarms, such as doorbells or fire alarms, usin…

FOS: Computer and information sciences; Computer Science - Machine Learning; Sound (cs.SD); sound processing; audio dataset; machine listening; UNESCO::CIENCIAS TECNOLÓGICAS; Computer Science - Sound; Machine Learning (cs.LG); classification; Artificial Intelligence; Audio and Speech Processing (eess.AS); Signal Processing; FOS: Electrical engineering electronic engineering information engineering; few-shot learning; open-set recognition; Computer Vision and Pattern Recognition; Software; Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

Acoustic Scene Classification with Squeeze-Excitation Residual Networks

2020

Acoustic scene classification (ASC) is a problem in the field of machine listening whose objective is to classify or tag an audio clip with a predefined label describing a scene location (e.g., park or airport). Many state-of-the-art solutions to ASC incorporate data augmentation techniques and model ensembles. However, considerable improvements can also be achieved solely by modifying the architecture of convolutional neural networks (CNNs). In this work we propose two novel squeeze-excitation blocks to improve the accuracy of a CNN-based ASC framework based on residual learning. The main idea of squeeze-excitation blocks is to learn spatial and channel-wise feature maps independently…

FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Machine Learning; General Computer Science; Calibration (statistics); Computer science; Residual; Convolutional neural network; Field (computer science); Computer Science - Sound; Machine Learning (cs.LG); 030507 speech-language pathology & audiology; 03 medical and health sciences; Audio and Speech Processing (eess.AS); Acoustic scene classification; Feature (machine learning); FOS: Electrical engineering electronic engineering information engineering; General Materials Science; Block (data storage); Artificial neural network; business.industry; pattern recognition; General Engineering; deep learning; Pattern recognition; machine listening; squeeze-excitation; Artificial intelligence; lcsh:Electrical engineering. Electronics. Nuclear engineering; 0305 other medical science; business; lcsh:TK1-9971; Electrical Engineering and Systems Science - Audio and Speech Processing
researchProduct

A case study on feature sensitivity for audio event classification using support vector machines

2016

Automatic recognition of multiple acoustic events is an interesting problem in machine listening that generalizes the classical speech/non-speech or speech/music classification problem. Typical audio streams contain a diversity of sound events that carry important and useful information on the acoustic environment and context. Classification is usually performed by means of hidden Markov models (HMMs) or support vector machines (SVMs) considering traditional sets of features based on Mel-frequency cepstral coefficients (MFCCs) and their temporal derivatives, as well as the energy from auditory-inspired filterbanks. However, while these features are routinely used by many systems, it is not …

Machine listening; Computer science; business.industry; Event (computing); Speech recognition; Feature extraction; Context (language use); Pattern recognition; 02 engineering and technology; Support vector machine; 030507 speech-language pathology & audiology; 03 medical and health sciences; ComputingMethodologies_PATTERNRECOGNITION; 0202 electrical engineering electronic engineering information engineering; Feature (machine learning); 020201 artificial intelligence & image processing; Artificial intelligence; Mel-frequency cepstrum; 0305 other medical science; business; Hidden Markov model; 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP)
researchProduct